Fix data corruption when creating batches. #11

Open
Wilfred wants to merge 1 commit into logstash-plugins:main from Wilfred:batching_causes_value_corruption

Wilfred commented Mar 14, 2015

Previously, we were appending events to the event collection even when
the columns did not match!

Given the following three events:

{
  "name": "my-series",
  "columns": [
    "cpu.user",
    "time"
  ],
  "points": [
    [
      5.0,
      1426288467
    ]
  ]
}
{
  "name": "my-series",
  "columns": [
    "memory.used",
    "time"
  ],
  "points": [
    [
      1000,
      1426288467
    ]
  ]
}
{
  "name": "my-series",
  "columns": [
    "cpu.user",
    "time"
  ],
  "points": [
    [
      6.0,
      1426288477
    ]
  ]
}

We would end up with the following batch:

[
  {
    "name": "my-series",
    "columns": [
      "cpu.user",
      "time"
    ],
    "points": [
      [
        5.0,
        1426288467
      ],
      [
        6.0,
        1426288467
      ]
    ]
  },
  {
    "name": "my-series",
    "columns": [
      "memory.used",
      "time"
    ],
    "points": [
      [
        1000.0,
        1426288467
      ],
      [
        6.0, # this shouldn't be here!
        1426288467
      ]
    ]
  }
]
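
One way to avoid this kind of mix-up is to key each batch entry on both the series name and the exact column list, so that points are only merged when the columns match. The following is a minimal sketch of that idea, not the plugin's actual code: `batch_events` is a hypothetical helper, and it assumes events are plain Hashes with `"name"`, `"columns"`, and `"points"` keys as in the examples above.

```ruby
# Hypothetical helper for illustration only (not the plugin's implementation).
# Groups events by series name AND column list, so points from events with
# different columns never end up in the same batch entry.
def batch_events(events)
  batches = {}
  events.each do |event|
    # Arrays hash by content in Ruby, so this key matches only when both
    # the name and the full column list are identical.
    key = [event["name"], event["columns"]]
    batches[key] ||= {
      "name" => event["name"],
      "columns" => event["columns"].dup,
      "points" => []
    }
    batches[key]["points"].concat(event["points"])
  end
  # Ruby Hashes preserve insertion order, so entries come out in the
  # order each (name, columns) pair was first seen.
  batches.values
end

events = [
  { "name" => "my-series", "columns" => ["cpu.user", "time"],    "points" => [[5.0, 1426288467]] },
  { "name" => "my-series", "columns" => ["memory.used", "time"], "points" => [[1000, 1426288467]] },
  { "name" => "my-series", "columns" => ["cpu.user", "time"],    "points" => [[6.0, 1426288477]] }
]

batch = batch_events(events)
```

With the three events above, this sketch yields two entries: the `cpu.user` entry holds both of its own points, and the `memory.used` entry holds only the single point that belongs to it.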

brupm commented May 29, 2015

I am not sure I understand this one. @Wilfred by columns do you mean inside the datapoints?

Wilfred commented May 30, 2015

Yes, inside the datapoints we're taking values from the wrong event. Does that make sense?

brupm commented May 30, 2015

Yes it does.

ghost commented Nov 2, 2015

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run; then say 'jenkins, test it'.
